In this example, I will demonstrate the details involved in training a Neural Network to distinguish between 37 different classes of images.
#This allows us to store the plots in the output
%matplotlib inline
Import the fastai v1 library, which sits on top of PyTorch 1.0.
#from fastai import *
from fastai.vision import *
from fastai.metrics import error_rate
We are going to use the Oxford-IIIT Pet Dataset by O. M. Parkhi et al., 2012, which features 12 cat breeds and 25 dog breeds.
Our model will learn to differentiate between these 37 distinct categories. According to their paper, the best accuracy they could get in 2012 was 59.21%, using a complex model that was specific to pet detection, with separate "Image", "Head", and "Body" models for the pet photos.

First we download the data (almost 800MB) and set up the path variables.
path = untar_data('https://s3.amazonaws.com/fast-ai-imageclas/oxford-iiit-pet');
path
path.ls()
images_path = path/'images'
Looking at the images in the folder, we can see that the labels are part of the image file names.
imagefilenames = get_image_files(images_path)
imagefilenames[:5]
Then we extract the labels from the filenames.
ImageDataBunch.from_name_re gets the labels from the filenames using a regular expression.
Note: 224 is the recommended image size here. According to the documentation, image sizes that are multiples of 7 tend to work best.
np.random.seed(2) #By setting the seed, we guarantee the validation set will be the same every time
regex_pattern = r'/([^/]+)_\d+.jpg$'
batchsize = 64 #halve or quarter this number if you run out of memory on the GPU
data = ImageDataBunch.from_name_re(
images_path, #the local path to our images.
imagefilenames, #the filenames of all images.
regex_pattern, #pattern for extracting the class names from the filename.
valid_pct=0.2, #this determines the size of our validation set
ds_tfms=get_transforms(), #Apply the default set of image augmentations
size=224, #This is the image size
bs=batchsize #how many images to train at once.
)
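As a quick sanity check (my own addition, not part of the original notebook), we can apply the same regular expression to a single filename and see which label it extracts:
import re
sample = str(imagefilenames[0]) #e.g. something like '.../images/Abyssinian_1.jpg'
match = re.search(regex_pattern, sample)
print(sample, '->', match.group(1)) #the captured group is the breed label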
Then we normalize the RGB channels so that they have a mean of zero and a standard deviation of one.
print(data.classes)
len(data.classes)
data.c
data.normalize(imagenet_stats)
data.show_batch(rows=3, figsize=(7,6))
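As another quick check (again my own addition), we can pull one batch and confirm that, after normalization, the pixel values have roughly zero mean and unit standard deviation:
x, y = data.one_batch(denorm=False) #keep the normalized values rather than the display-denormalized ones
print(x.mean(), x.std()) #should be close to 0 and 1 respectively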
We will use a convolutional neural network backbone and a fully connected head with a single hidden layer as a classifier.
PyTorch includes pre-trained models:
resnet34 - pretrained; faster, good for quick tests
resnet50 - pretrained; slower, use it for the best results!
Note: Think of these as an excellent set of predefined weights for the Neural Net. Training on top of these weights has proven to provide a significant training advantage.
If you are wondering why we should use the resnet50 model, please see the Stanford benchmarks for image training models.
#Create a Learner object that wraps our data and model and runs the training epochs
#models.resnet34 will use the generically pretrained resnet34 neural network
#pretrained:bool=True This will download a set of pretrained weights based on the selected model
learn = cnn_learner(data, models.resnet50, pretrained=True, metrics=[error_rate,accuracy])
#when doing proof of concept, train using resnet34, it is very good and much faster.
#learn = cnn_learner(data, models.resnet34, pretrained=True, metrics=error_rate)
This is the concept of Transfer Learning. By starting from a pretrained neural network, we take a model that knows how to do something very well, and make it do our thing (classifying breeds of cats and dogs) very well!
This method of transfer learning can cut training time to roughly 1/100th of training from scratch, and can reduce the amount of data needed by a similar factor!
#Run learn.model to see some details about the model
#learn.model
learn.lr_find()
learn.recorder.plot()
NOTE: I will be using fit_one_cycle instead of fit. It implements the 1cycle policy (Leslie Smith, 2018), which was shown to train networks considerably faster.
learn.fit_one_cycle(4)
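If you are curious what the 1cycle policy actually did to the learning rate during those epochs, the recorder that logged the run can plot the schedule (an optional inspection step, not required for training):
learn.recorder.plot_lr(show_moms=True) #learning rate warms up then anneals; momentum moves the opposite way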
Epoch: a complete training iteration over the full image set.
Train_Loss: the loss on the training data. We want our train_loss to end up LESS than our valid_loss; otherwise we have not trained enough.
Valid_Loss: the loss computed on the validation set.
Error_Rate: we use a validation set to make sure we are not overfitting. The validation set is a set of images that the model does not get to look at during training; error_rate is calculated by testing on the validation set.
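For reference, error_rate in fastai is simply 1 - accuracy, computed on the validation set. A small optional check of that relationship on our own validation predictions:
preds, targets = learn.get_preds(ds_type=DatasetType.Valid) #predictions on the validation set only
err = error_rate(preds, targets).item()
print(f'error_rate: {err:.4f}  accuracy: {1 - err:.4f}')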
#This saves the model weights in a 'models' subdirectory next to the data
learn.save('stage-1')
At this point, we have completed training our model and we can start using it.
We create an interpreter from the learner, which at this point contains both the data and the trained model.
interpreter = ClassificationInterpretation.from_learner(learn)
interpreter.plot_confusion_matrix(figsize=(12,12), dpi=60)
The confusion matrix shows which breed pairs our classifier has trouble with. Otherwise, it is doing a pretty good job!
Above, I looked at the matrix to figure out where the highest errors are.
The following call gives us a printed list of the most confused class pairs. I found this function very useful alongside the matrix.
interpreter.most_confused(min_val=3)
The best feature of all is the ability to see the images the model got most wrong. Looking at these, we can get an idea of why it failed to classify them correctly.
#doc(interpreter.plot_top_losses)
interpreter.plot_top_losses(16, figsize=(25,25), heatmap=False)
Here are the images again with the heatmap turned ON. The heatmap tells us which part of the image the model was using to make its classification, which is also very useful.
interpreter.plot_top_losses(16, figsize=(25,25), heatmap=True)
The following layer descriptions are based on the visualizations in Zeiler & Fergus, "Visualizing and Understanding Convolutional Networks" (arXiv:1311.2901, submitted 12 Nov 2013, last revised 28 Nov 2013).
Layer 1: the network picks up simple lines and gradients.

Layer 2: corners, curves, circles, and simple patterns.

Layer 3: repeating patterns of objects, like text, faces, wheels, etc.

As we go deeper and deeper, the pre-trained neural net can recognize more and more complex shapes and groups of features. The net already knows the difference between a dog and a cat, and has a built-in conception of many different types of dogs and cats, but since it was not trained specifically to distinguish between them, initially it cannot.
Our training builds on top of this knowledge: we replace the final layers (the head) with new layers and train them for our task.
By default, training affects all layers. Knowing what we know about this neural net, there is little benefit in retraining the early layers, since they are already very good at what they do. To leverage transfer learning, we will scale the learning rates across the network so that training focuses most on the final layers and less and less on the earlier ones.
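In fastai, this scaled training is expressed with discriminative learning rates: we pass a slice of learning rates to fit_one_cycle, and each layer group gets its own rate, smallest for the earliest layers and largest for the head. A minimal sketch of the idea (the actual values we use later come from the learning rate finder):
#cnn_learner splits the network into layer groups (typically three for a resnet backbone)
print(len(learn.layer_groups))
#slice(1e-6, 1e-4) would give 1e-6 to the first group, 1e-4 to the last,
#and intermediate values to the groups in between, for example:
#learn.fit_one_cycle(1, max_lr=slice(1e-6, 1e-4))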
Now, with my deeper understanding of pre-trained CNNs, I will try to fine tune this training to get even better results.
#This sets every layer group to trainable
learn.unfreeze()
#First I will demonstrate how training all layers can make our model worse and increase the error_rate
learn.fit_one_cycle(1)
Our error_rate was 0.062923; after training all layers equally, it is now 0.097429.
That is roughly a 3.5 percentage point increase in errors! In a resnet34 example, the error rate could increase by 8-10%.
Let us roll back this training session
learn.load('stage-1');
To find out which learning rates work well for this model on our data, we run the learning rate finder and plot the result.
learn.lr_find()
learn.recorder.plot()
Notice how the line shoots up at the very end? This is telling us that our loss will increase if we train at learning rates greater than about 1e-4. The default learning rate is 0.003, which is too high for our data at this stage.
Note on reading the e notation for small numbers:
1e-4 = 0.0001
1e-3 = 0.001
learn.unfreeze()
learn.fit_one_cycle(4, max_lr=slice(1e-6,3e-3))
learn.save('stage-2')
learn.fit_one_cycle(4, max_lr=slice(1e-6,1e-3))
As you can see, further training at these rates yielded worse results, and this trend continues with more epochs.
#Undo the latest learning results
#learn.load('stage-2');
learn.lr_find()
learn.recorder.plot()
learn.fit_one_cycle(8, max_lr=slice(1e-6,3e-5))
#learn.load('stage-2');
learn.save('stage-3')
interpreter = ClassificationInterpretation.from_learner(learn)
interpreter.most_confused(min_val=3)
I am very happy with these results!
This will create a file named 'export.pkl' in the directory where we were working. It contains everything we need to deploy our model: the model architecture, the weights, and some metadata such as the classes and the transforms/normalization used.
#this gives the path where the learner's data lives (and where export.pkl will be written)
source = learn.path
#this will create the trained model file that can be used in our API
learn.export()
source = source/'export.pkl'
source
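Before copying it anywhere, we can sanity-check the exported file by loading it back with load_learner and running a single prediction (I just reuse the first training image here; any pet photo would do):
learn_inference = load_learner(learn.path) #looks for 'export.pkl' in that directory
img = open_image(imagefilenames[0])
pred_class, pred_idx, probs = learn_inference.predict(img)
print(pred_class) #the predicted breed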
Now copy the file into our McMaster directory and rename it to 'breedsofcatsanddogs.pkl'.
from shutil import copyfile
copyfile(source, '/data/home/abdqeb/notebooks/McMaster/breedsofcatsanddogs.pkl')
Then download the file and replace the copy in the onedrive/Public folder.